10 research outputs found
Crowdsourced PAC Learning under Classification Noise
In this paper, we analyze PAC learnability from labels produced by
crowdsourcing. In our setting, unlabeled examples are drawn from a distribution
and labels are crowdsourced from workers who operate under classification
noise, each with their own noise parameter. We develop an end-to-end
crowdsourced PAC learning algorithm that takes unlabeled data points as input
and outputs a trained classifier. Our three-step algorithm incorporates
majority voting, pure-exploration bandits, and noisy-PAC learning. We prove
several guarantees on the number of tasks labeled by workers for PAC learning
in this setting and show that our algorithm improves upon the baseline by
reducing the total number of tasks given to workers. We demonstrate the
robustness of our algorithm by exploring its application to additional
realistic crowdsourcing settings.
Comment: 14 pages
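The first step of the pipeline, majority voting over workers with heterogeneous noise rates, can be simulated in a few lines. This is a minimal sketch with hypothetical task counts and noise rates, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n binary tasks, each labeled by k workers who flip the
# true label independently with worker-specific noise rates eta_j < 1/2.
n_tasks, n_workers = 1000, 11
true_labels = rng.integers(0, 2, size=n_tasks)
noise_rates = rng.uniform(0.05, 0.35, size=n_workers)   # each worker's eta_j

# Each worker flips each true label with probability eta_j (classification noise).
flips = rng.random((n_tasks, n_workers)) < noise_rates
worker_labels = np.where(flips, 1 - true_labels[:, None], true_labels[:, None])

# Majority vote per task (n_workers is odd, so there are no ties).
majority = (worker_labels.sum(axis=1) > n_workers / 2).astype(int)

accuracy = (majority == true_labels).mean()
print(f"majority-vote accuracy: {accuracy:.3f}")
```

With an odd number of workers whose noise rates stay below 1/2, the vote recovers most labels; the bandit and noisy-PAC stages described in the abstract are what control how many such labels need to be requested.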
Deconfounded Causal Collaborative Filtering
Recommender systems may be confounded by various types of confounding factors
(also called confounders) that may lead to inaccurate recommendations and
degraded recommendation performance. Current approaches to the problem usually
design a dedicated model for each specific confounder. However, real-world
systems may involve a huge number of confounders, so designing a dedicated
model for each one is unrealistic. More
importantly, except for those "explicit confounders" that researchers can
manually identify and process such as item's position in the ranking list,
there are also many "latent confounders" that are beyond the imagination of
researchers. For example, users' ratings of a song may depend on their current
mood or the weather, and users' preference for ice cream may depend on
the air temperature. Such latent confounders may be unobservable in the
recorded training data. To solve the problem, we propose a deconfounded causal
collaborative filtering model. We first frame user behaviors with unobserved
confounders into a causal graph, and then we design a front-door adjustment
model carefully fused with machine learning to deconfound the influence of
unobserved confounders. The proposed model is able to handle both global
confounders and personalized confounders. Experiments on real-world e-commerce
datasets show that our method is able to deconfound unobserved confounders to
achieve better recommendation performance.
Comment: 9 pages, 5 figures; comments and suggestions are highly appreciated
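The front-door adjustment that the model builds on can be checked numerically on a toy discrete structural model. All distributions below are illustrative, not the paper's recommendation setting: with a latent confounder U and mediator chain X -> M -> Y, the formula P(y|do(x)) = sum_m P(m|x) sum_x' P(x') P(y|x',m) recovers the interventional distribution exactly, while the naive conditional P(y|x) is biased by U:

```python
import numpy as np

# Toy structural model: latent U -> {X, Y}, mediator chain X -> M -> Y.
p_u = np.array([0.6, 0.4])                     # P(U)
p_x_u = np.array([[0.8, 0.2],                  # P(X | U=0)
                  [0.3, 0.7]])                 # P(X | U=1)
p_m_x = np.array([[0.9, 0.1],                  # P(M | X=0)
                  [0.2, 0.8]])                 # P(M | X=1)
p_y1_mu = np.array([[0.1, 0.5],                # P(Y=1 | M=0, U=u)
                    [0.4, 0.9]])               # P(Y=1 | M=1, U=u)

# Observational quantities (U is latent and never observed directly).
p_ux = p_u[:, None] * p_x_u                    # P(u, x), shape (u, x)
p_x = p_ux.sum(axis=0)                         # P(x)
p_u_given_x = p_ux / p_x                       # P(u | x), shape (u, x)
# P(Y=1 | x, m) = sum_u P(u | x) P(Y=1 | m, u)   (M depends only on X)
p_y1_xm = (p_y1_mu @ p_u_given_x).T            # shape (x, m)

# Front-door adjustment: P(Y=1|do(x)) = sum_m P(m|x) sum_x' P(x') P(Y=1|x',m)
inner = p_x @ p_y1_xm                          # shape (m,)
front_door = p_m_x @ inner                     # shape (x,)

# Ground truth by direct intervention in the structural model.
truth = p_m_x @ (p_y1_mu @ p_u)

# Naive conditional P(Y=1 | x) is biased by the latent confounder.
confounded = np.einsum("xm,ux,mu->x", p_m_x, p_u_given_x, p_y1_mu)

print(front_door, truth, confounded)
```

The front-door estimate matches the interventional truth term for term, while the confounded conditional does not; this is the deconfounding effect the abstract describes, here in its textbook discrete form.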
On the Unlikelihood of D-Separation
Causal discovery aims to recover a causal graph from data generated by it;
constraint based methods do so by searching for a d-separating conditioning set
of nodes in the graph via an oracle. In this paper, we provide analytic
evidence that on large graphs, d-separation is a rare phenomenon, even when
guaranteed to exist, unless the graph is extremely sparse. We then provide an
analytic average case analysis of the PC Algorithm for causal discovery, as
well as a variant of the SGS Algorithm we call UniformSGS. We consider a set
V of n nodes, and generate a random DAG on V in which each edge (i, j) with
i < j is included independently with probability p. We provide upper bounds on
the probability that a subset of V d-separates a given pair of nodes,
conditional on that pair being d-separable; our upper bounds decay
exponentially fast to 0 as n goes to infinity. For
the PC Algorithm, while it is known that its worst-case guarantees fail on
non-sparse graphs, we show that the same is true for the average case, and that
the sparsity requirement is quite demanding: for good performance, the density
must go to 0 as n goes to infinity, even in the average case. For
UniformSGS, while it is known that the running time is exponential for existing
edges, we show that in the average case this is also the expected running time
for most non-existing edges.
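The random-graph model in the abstract, each forward edge included independently with probability p, is straightforward to simulate. A minimal NumPy sketch with illustrative n and p also computes directed reachability, one intuition for why d-separating sets become rare as graphs densify:

```python
import numpy as np

rng = np.random.default_rng(2)

# Random-DAG model: n ordered nodes; each edge (i, j) with i < j is included
# independently with probability p. Acyclicity is immediate because every
# edge respects the node ordering.
n, p = 50, 0.1
upper = np.triu(np.ones((n, n), dtype=bool), k=1)   # allowed slots i < j
adj = upper & (rng.random((n, n)) < p)              # i.i.d. Bernoulli(p) edges

n_edges = int(adj.sum())
print(f"{n_edges} edges (expected about {p * n * (n - 1) / 2:.0f})")

# Transitive closure: fraction of ordered pairs joined by a directed path.
# Even modest p makes most pairs ancestrally connected, which loosely mirrors
# the abstract's point that d-separation is rare unless the graph is sparse.
reach = adj.astype(int)
for _ in range(n):
    nxt = ((reach + reach @ reach) > 0).astype(int)
    if np.array_equal(nxt, reach):
        break
    reach = nxt
frac = reach[upper].mean()
print(f"fraction of ordered pairs with a directed path: {frac:.2f}")
```

Actually testing d-separation on such graphs requires a conditioning-set search, which is exactly the expensive step the paper analyzes; this sketch only sets up the generative model.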
Enhancing Performance on Seen and Unseen Dialogue Scenarios using Retrieval-Augmented End-to-End Task-Oriented System
End-to-end task-oriented dialogue (TOD) systems have achieved promising
performance by leveraging sophisticated natural language understanding and
natural language generation capabilities of pre-trained models. This work
gives TOD systems more flexibility through a simple cache. The cache
provides the flexibility to dynamically update the TOD systems and handle both
existing and unseen dialogue scenarios. Towards this end, we first fine-tune a
retrieval module to effectively retrieve the most relevant information entries
from the cache. We then train end-to-end TOD models that can refer to and
ground on both dialogue history and retrieved information during TOD
generation. The cache is straightforward to construct, and the backbone models
of TOD systems are compatible with existing pre-trained generative models.
Extensive experiments demonstrate the superior performance of our framework,
with a notable improvement in non-empty joint goal accuracy by 6.7% compared to
strong baselines.
Comment: Accepted by SIGDIAL 2023 as a long paper
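The cache-and-retrieve interface described above can be illustrated with a deliberately simple stand-in retriever: bag-of-words cosine similarity instead of the paper's fine-tuned retrieval module, with made-up entries and queries. The point it shows is that the cache can be updated at run time, so unseen scenarios need no retraining:

```python
import math
from collections import Counter

# Hypothetical cache of information entries keyed by name.
cache = {
    "hotel_alpha": "hotel alpha cheap rooms city centre free wifi",
    "museum_hours": "science museum open 9am to 5pm closed monday",
    "taxi_booking": "taxi booking requires pickup place destination and time",
}

def _vec(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Return the keys of the k most relevant cache entries."""
    q = _vec(query)
    ranked = sorted(cache.items(),
                    key=lambda kv: _cosine(q, _vec(kv[1])), reverse=True)
    return [key for key, _ in ranked[:k]]

# Unseen scenario: add an entry at run time -- no retraining needed.
cache["new_restaurant"] = "new thai restaurant north area moderate price range"
print(retrieve("is there a thai restaurant in the north"))
```

In the paper's framework, the retrieved entries are then grounded on by the end-to-end TOD model alongside the dialogue history; the retrieval scoring itself is learned rather than lexical.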
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
Despite advancements in conversational AI, language models encounter
challenges in handling diverse conversational tasks, and existing dialogue
dataset collections often lack diversity and comprehensiveness. To tackle these
issues, we introduce DialogStudio: the largest and most diverse collection of
dialogue datasets, unified under a consistent format while preserving their
original information. Our collection encompasses data from open-domain
dialogues, task-oriented dialogues, natural language understanding,
conversational recommendation, dialogue summarization, and knowledge-grounded
dialogues, making it an incredibly rich and diverse resource for dialogue
research and model training. To further enhance the utility of DialogStudio, we
identify the licenses for each dataset and design domain-aware prompts for
selected dialogues to facilitate instruction-aware fine-tuning. Furthermore, we
develop conversational AI models using the dataset collection, and our
experiments in both zero-shot and few-shot learning scenarios demonstrate the
superiority of DialogStudio. To improve transparency and support dataset and
task-based research, as well as language model pre-training, all datasets,
licenses, codes, and models associated with DialogStudio are made publicly
accessible at https://github.com/salesforce/DialogStudio
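What "unified under a consistent format while preserving original information" can look like in code, using a hypothetical envelope schema (the field names below are illustrative, not DialogStudio's actual format):

```python
# Sketch: wrap dataset-specific dialogues in one common envelope while keeping
# each source's native fields verbatim under "original".
def unify(source_name, turns, extras=None):
    """Map a (user, system) turn list into a hypothetical unified record."""
    return {
        "dataset": source_name,
        "log": [{"turn_id": i, "user": u, "system": s}
                for i, (u, s) in enumerate(turns)],
        "original": extras or {},   # source-specific fields, preserved as-is
    }

# Two datasets with different native shapes map into the same envelope.
a = unify("open_domain_x", [("hi there", "hello! how can I help?")])
b = unify("tod_y", [("book a table", "for how many people?")],
          extras={"belief_state": {"restaurant-people": "?"}})

print(a["log"][0]["system"])
print(b["original"])
```

Because every record shares the same top-level keys, downstream tooling (instruction-aware fine-tuning, prompt construction) can iterate over heterogeneous sources uniformly, which is the practical payoff of the unification the abstract describes.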
Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data
We introduce the Salesforce CausalAI Library, an open-source library for
causal analysis using observational data. It supports causal discovery and
causal inference for tabular and time series data, of both discrete and
continuous types. This library includes algorithms that handle linear and
non-linear causal relationships between variables, and uses multi-processing
for speed-up. We also include a data generator capable of generating synthetic
data with a specified structural equation model for both of the aforementioned
data formats and types, which helps users control the ground-truth causal process
while investigating various algorithms. Finally, we provide a user interface
(UI) that allows users to perform causal analysis on data without coding. The
goal of this library is to provide a fast and flexible solution for a variety
of problems in the domain of causality. This technical report describes the
Salesforce CausalAI API along with its capabilities, the implementations of the
supported algorithms, and experiments demonstrating their performance and
speed. Our library is available at
https://github.com/salesforce/causalai
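The synthetic-data generator described above is based on structural equation models; the concept can be sketched in plain NumPy with a small linear SEM. This illustrates the idea only and does not use the CausalAI library's actual API:

```python
import numpy as np

rng = np.random.default_rng(3)

# Ground-truth SEM: each (parent, child) edge carries a linear weight, and
# each variable is a weighted sum of its parents plus independent Gaussian
# noise. Because we specify the SEM, the true causal graph is known, which is
# what lets users benchmark discovery algorithms against ground truth.
coef = {
    ("a", "b"): 2.0,
    ("a", "c"): -1.0,
    ("b", "c"): 0.5,
}
order = ["a", "b", "c"]          # topological order of the DAG
n = 5000

data = {}
for var in order:
    parents = [(p, w) for (p, ch), w in coef.items() if ch == var]
    val = rng.normal(0.0, 1.0, size=n)          # exogenous noise term
    for p, w in parents:
        val = val + w * data[p]                 # add each parent's effect
    data[var] = val

# Sanity check: b = 2a + noise, so corr(a, b) should be strongly positive.
corr_ab = np.corrcoef(data["a"], data["b"])[0, 1]
print(f"corr(a, b) = {corr_ab:.2f}")
```

The library's generator additionally supports discrete variables and time series; this sketch covers only the continuous tabular case to show the control-the-ground-truth idea.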
REX: Rapid Exploration and eXploitation for AI Agents
In this paper, we propose an enhanced approach for Rapid Exploration and
eXploitation for AI Agents called REX. Existing AutoGPT-style techniques have
inherent limitations, such as a heavy reliance on precise descriptions for
decision-making, and the lack of a systematic approach to leverage try-and-fail
procedures akin to traditional Reinforcement Learning (RL). REX introduces an
additional layer of rewards and integrates concepts similar to Upper Confidence
Bound (UCB) scores, leading to more robust and efficient AI agent performance.
This approach can utilize offline behaviors from logs and integrates seamlessly
with existing foundation models, without requiring any model
fine-tuning. Through comparative
analysis with existing methods such as Chain-of-Thought (CoT) and Reasoning via
Planning (RAP), REX-based methods demonstrate comparable performance and, in
certain cases, even surpass the results achieved by these existing techniques.
Notably, REX-based methods exhibit remarkable reductions in execution time,
enhancing their practical applicability across a diverse set of scenarios.
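The UCB-style scoring that REX draws on can be illustrated with a minimal bandit-style action selector. The actions and rewards here are hypothetical, and this is not REX's actual scoring rule:

```python
import math
import random

random.seed(0)

class UCBSelector:
    """Pick among candidate actions by mean reward plus an exploration bonus."""

    def __init__(self, actions, c=1.4):
        self.c = c
        self.counts = {a: 0 for a in actions}
        self.totals = {a: 0.0 for a in actions}
        self.t = 0

    def pick(self):
        self.t += 1
        for a, n in self.counts.items():        # try every action once first
            if n == 0:
                return a
        return max(self.counts, key=lambda a:
                   self.totals[a] / self.counts[a]
                   + self.c * math.sqrt(math.log(self.t) / self.counts[a]))

    def update(self, action, reward):
        self.counts[action] += 1
        self.totals[action] += reward

# Hypothetical agent actions with unknown success rates.
sel = UCBSelector(["search", "code", "ask_user"])
true_mean = {"search": 0.3, "code": 0.8, "ask_user": 0.1}
for _ in range(300):
    a = sel.pick()
    sel.update(a, 1.0 if random.random() < true_mean[a] else 0.0)

print(sel.counts)
```

The exploration bonus shrinks as an action accumulates trials, so the selector concentrates on high-reward actions without abandoning the rest, which is the try-and-fail behavior the abstract contrasts with description-driven AutoGPT-style decisions.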
Communication-Aware Collaborative Learning
Algorithms for noiseless collaborative PAC learning have been analyzed and optimized in recent years with respect to sample complexity. In this paper, we study collaborative PAC learning with the goal of reducing communication cost at essentially no penalty to the sample complexity. We develop communication-efficient collaborative PAC learning algorithms using distributed boosting. We then consider the communication cost of collaborative learning in the presence of classification noise. As an intermediate step, we show how collaborative PAC learning algorithms can be adapted to handle classification noise. With this insight, we develop communication-efficient algorithms for collaborative PAC learning robust to classification noise.
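The communication-versus-samples trade-off motivating this work can be seen even in a toy setting: for 1-D thresholds, each client can summarize its local sample with two numbers instead of shipping the sample itself. This is a noiseless cartoon of communication-efficient collaborative learning, not the paper's distributed-boosting algorithm:

```python
import numpy as np

rng = np.random.default_rng(4)

# Target: a 1-D threshold h(x) = 1[x >= t_star]. Each client draws from its
# own marginal distribution but labels are consistent with the same t_star.
t_star = 0.37
clients = [rng.uniform(lo, hi, size=500)
           for (lo, hi) in [(0.0, 1.0), (0.2, 0.6), (0.3, 0.9)]]

# Each client sends just two floats: its largest negative point and its
# smallest positive point -- O(k) total communication instead of O(k * m).
summaries = []
for x in clients:
    y = x >= t_star
    neg_max = x[~y].max() if (~y).any() else -np.inf
    pos_min = x[y].min() if y.any() else np.inf
    summaries.append((neg_max, pos_min))

lo = max(s[0] for s in summaries)      # largest negative across all clients
hi = min(s[1] for s in summaries)      # smallest positive across all clients
t_hat = (lo + hi) / 2                  # consistent with every client's sample

errors = sum(int(((x >= t_hat) != (x >= t_star)).sum()) for x in clients)
print(f"t_hat = {t_hat:.4f}, training errors = {errors}")
```

With classification noise, two summary statistics no longer suffice, since any single point may be mislabeled; handling that regime without inflating communication is precisely the problem the abstract addresses.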
Tackling Data Heterogeneity in Federated Learning with Class Prototypes
Data heterogeneity across clients in federated learning (FL) settings is a widely acknowledged challenge. In response, personalized federated learning (PFL) emerged as a framework to curate local models for clients' tasks. In PFL, a common strategy is to develop local and global models jointly - the global model (for generalization) informs the local models, and the local models (for personalization) are aggregated to update the global model. A key observation is that if we can improve the generalization ability of local models, then we can improve the generalization of global models, which in turn builds better personalized models. In this work, we consider class imbalance, an overlooked type of data heterogeneity, in the classification setting. We propose FedNH, a novel method that improves the local models' performance for both personalization and generalization by combining the uniformity and semantics of class prototypes. FedNH initially distributes class prototypes uniformly in the latent space and smoothly infuses the class semantics into class prototypes. We show that imposing uniformity helps to combat prototype collapse while infusing class semantics improves local models. Extensive experiments were conducted on popular classification datasets under the cross-device setting. Our results demonstrate the effectiveness and stability of our method over recent works.
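The two ingredients of FedNH, uniform prototype initialization and smooth infusion of class semantics, can be sketched in 2-D. The dimension, class count, and smoothing factor rho below are illustrative choices, not the paper's settings:

```python
import numpy as np

# (1) Initialize class prototypes uniformly on the unit circle, so no two
# classes start collapsed onto each other.
n_classes = 4
angles = 2 * np.pi * np.arange(n_classes) / n_classes
prototypes = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# (2) Smoothly infuse class semantics: move each prototype a small step
# toward the normalized mean feature of its class, then re-normalize so
# prototypes stay on the unit sphere.
def infuse(protos, class_means, rho):
    means = class_means / np.linalg.norm(class_means, axis=1, keepdims=True)
    new = rho * protos + (1 - rho) * means
    return new / np.linalg.norm(new, axis=1, keepdims=True)

rng = np.random.default_rng(5)
class_means = rng.normal(size=(n_classes, 2))   # stand-in for learned features
updated = infuse(prototypes, class_means, rho=0.9)

print(np.linalg.norm(updated, axis=1))          # all unit-norm
```

With rho close to 1, each round nudges the prototypes only slightly, so the uniform geometry that combats prototype collapse erodes slowly while class semantics accumulate, which is the balance the abstract describes.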